Optimal transport (OT) theory describes how to define and select, among many possible choices, the most efficient way to map one probability measure onto another. The theory has mostly been used to estimate, given a pair of source and target probability measures $(\mu, \nu)$, a parameterized map $T_\theta$ that can efficiently map $\mu$ onto $\nu$. In many applications, such as predicting cell responses to treatments, the measures $\mu, \nu$ (features of untreated/treated cells) that define an optimal transport problem do not arise in isolation but are associated with a context $c$ (the treatment). To account for and incorporate that context into OT estimation, we introduce CondOT, an approach that estimates OT maps conditioned on a context variable, using several pairs of measures $(\mu_i, \nu_i)$ tagged with a context label $c_i$. Our goal is to learn from a dataset of labeled pairs $\{(c_i, (\mu_i, \nu_i))\}$ a global map $\mathcal{T}_{\theta}$ which is not only expected to fit all pairs in the dataset, i.e. $\mathcal{T}_{\theta}(c_i)_{\sharp}\mu_i \approx \nu_i$, but should also generalize to produce meaningful maps $\mathcal{T}_{\theta}(c_{\text{new}})$ conditioned on unseen contexts $c_{\text{new}}$. Our approach harnesses, and provides a novel usage for, partially input convex neural networks, for which we introduce a robust and efficient initialization strategy inspired by Gaussian approximations. We demonstrate the ability of CondOT to infer the effect of an arbitrary combination of genetic or therapeutic perturbations on single cells, using only observations of the effects of said perturbations applied separately.
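As an illustrative formulation (the loss and the divergence $D$ below are assumptions chosen to match the description, not taken verbatim from the abstract), the conditional fitting objective can be sketched as
$$\min_{\theta} \; \sum_{i} D\big(\mathcal{T}_{\theta}(c_i)_{\sharp}\, \mu_i,\; \nu_i\big),$$
where $D$ is some discrepancy between probability measures (e.g. a Sinkhorn divergence) and $\mathcal{T}_{\theta}(c)_{\sharp}\mu$ denotes the pushforward of $\mu$ by the map conditioned on context $c$; generalization then amounts to evaluating $\mathcal{T}_{\theta}(c_{\text{new}})$ on an unseen context.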
Learning representations that capture the underlying data-generating process is a key problem for the data-efficient and robust use of neural networks. One key property for robustness that a learned representation should capture, and which has recently received a lot of attention, is described by the notion of invariance. In this work, we provide a causal perspective and a new algorithm for learning invariant representations. Empirically, we show that the algorithm works well across a diverse set of tasks; in particular, we observe state-of-the-art performance on domain generalization, where we are able to significantly boost the scores of existing models.
Protein complex formation is a central problem in biology: it is involved in most of the cell's processes and is essential for applications such as drug design and protein engineering. We tackle rigid-body protein-protein docking, i.e., computationally predicting the 3D structure of a protein-protein complex from the individual unbound structures, assuming no conformational change within the proteins during binding. We design a novel pairwise-independent SE(3)-equivariant graph matching network to predict the rotation and translation that place one of the proteins at the right docked position relative to the second protein. We mathematically guarantee a basic principle: the predicted complex is identical regardless of the initial positions and orientations of the two structures. Our model, named EquiDock, approximates the binding pockets and predicts the docking pose using keypoint matching and alignment, achieved through optimal transport and a differentiable Kabsch algorithm. Empirically, we achieve significant running-time improvements and often outperform existing docking software, despite not relying on heavy candidate sampling, structure refinement, or templates.
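For illustration only, here is a minimal NumPy sketch of the classical (non-differentiable, SVD-based) Kabsch step that such a pipeline builds on: given two sets of paired keypoints, it recovers the rotation and translation that best superimpose them in a least-squares sense. The function name and interface are assumptions, not the EquiDock implementation.

import numpy as np

def kabsch_align(P, Q):
    """Least-squares rigid alignment of paired keypoints.

    P, Q: (N, 3) arrays of corresponding 3D points.
    Returns (R, t) such that R @ P[i] + t ~= Q[i].
    Illustrative sketch only -- not the EquiDock implementation.
    """
    # Center both point clouds.
    p_mean, q_mean = P.mean(axis=0), Q.mean(axis=0)
    P0, Q0 = P - p_mean, Q - q_mean

    # Cross-covariance matrix and its SVD.
    H = P0.T @ Q0
    U, _, Vt = np.linalg.svd(H)

    # Reflection correction keeps R a proper rotation (det = +1).
    d = np.sign(np.linalg.det(Vt.T @ U.T))
    D = np.diag([1.0, 1.0, d])
    R = Vt.T @ D @ U.T
    t = q_mean - R @ p_mean
    return R, t

# Tiny usage example with a known rotation about the z-axis.
theta = np.pi / 6
R_true = np.array([[np.cos(theta), -np.sin(theta), 0.0],
                   [np.sin(theta),  np.cos(theta), 0.0],
                   [0.0, 0.0, 1.0]])
P = np.random.randn(10, 3)
Q = P @ R_true.T + np.array([1.0, -2.0, 0.5])
R_est, t_est = kabsch_align(P, Q)
assert np.allclose(R_est, R_true, atol=1e-6)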
Consider a population of particles evolving over time, monitored through snapshots of particles sampled from the population at successive timestamps. Given only access to these snapshots, can we reconstruct the individual trajectories of these particles? This question lies at the heart of several important scientific challenges of our time, notably single-cell genomics. In this paper, we propose to model population dynamics as realizations of a Jordan-Kinderlehrer-Otto (JKO) flow of measures: the JKO scheme posits that the new configuration taken by the population at time t+1 is one that trades off a decrease in energy (i.e., a better configuration for the population) against remaining close (in Wasserstein distance) to the previous configuration observed at time t. Our goal in this work is to learn such an energy given data. To that end, we propose JKOnet, a neural architecture that computes (in an end-to-end differentiable fashion) the JKO flow given a parameterized energy and an initial configuration of points. We demonstrate the good performance and robustness of the JKOnet fitting procedure compared to a more direct forward approach.
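As a brief illustration of the scheme referenced above (the step size $\tau$ and the exact form follow standard JKO notation and are assumed here rather than quoted from the abstract), one JKO update reads
$$\mu_{t+1} \in \operatorname*{arg\,min}_{\mu} \; E_\theta(\mu) + \frac{1}{2\tau}\, W_2^2(\mu, \mu_t),$$
where $E_\theta$ is the (parameterized) energy to be learned and $W_2$ is the 2-Wasserstein distance; the new configuration lowers the energy while staying close to the previous one.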
We study the multiclass classification problem where the features come from a mixture of time-homogeneous diffusions. Specifically, the classes are discriminated by their drift functions, while the diffusion coefficient is common to all classes and unknown. In this framework, we build a plug-in classifier which relies on nonparametric estimators of the drift and diffusion functions. We first establish the consistency of our classification procedure under mild assumptions and then provide rates of convergence under different sets of assumptions. Finally, a numerical study supports our theoretical findings.
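To fix ideas (the notation below is an assumption chosen to match the description, not quoted from the abstract), a feature path from class $k$ can be thought of as a solution of
$$dX_t = b_k(X_t)\,dt + \sigma(X_t)\,dW_t,$$
with a class-specific drift $b_k$ and a diffusion coefficient $\sigma$ that is common to all classes and unknown; a plug-in classifier then substitutes nonparametric estimators $\hat b_k$ and $\hat\sigma$ into the classification rule built from these quantities.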
In recent years, we have seen a significant interest in data-driven deep learning approaches for video anomaly detection, where an algorithm must determine if specific frames of a video contain abnormal behaviors. However, video anomaly detection is particularly context-specific, and the availability of representative datasets heavily limits real-world accuracy. Additionally, the metrics currently reported by most state-of-the-art methods often do not reflect how well the model will perform in real-world scenarios. In this article, we present the Charlotte Anomaly Dataset (CHAD). CHAD is a high-resolution, multi-camera anomaly dataset in a commercial parking lot setting. In addition to frame-level anomaly labels, CHAD is the first anomaly dataset to include bounding box, identity, and pose annotations for each actor. This is especially beneficial for skeleton-based anomaly detection, which is useful for its lower computational demand in real-world settings. CHAD is also the first anomaly dataset to contain multiple views of the same scene. With four camera views and over 1.15 million frames, CHAD is the largest fully annotated anomaly detection dataset including person annotations, collected from continuous video streams from stationary cameras for smart video surveillance applications. To demonstrate the efficacy of CHAD for training and evaluation, we benchmark two state-of-the-art skeleton-based anomaly detection algorithms on CHAD and provide comprehensive analysis, including both quantitative results and qualitative examination.
The number of international benchmarking competitions is steadily increasing in various fields of machine learning (ML) research and practice. So far, however, little is known about the common practice as well as bottlenecks faced by the community in tackling the research questions posed. To shed light on the status quo of algorithm development in the specific field of biomedical imaging analysis, we designed an international survey that was issued to all participants of challenges conducted in conjunction with the IEEE ISBI 2021 and MICCAI 2021 conferences (80 competitions in total). The survey covered participants' expertise and working environments, their chosen strategies, as well as algorithm characteristics. A median of 72% of challenge participants took part in the survey. According to our results, knowledge exchange was the primary incentive (70%) for participation, while the reception of prize money played only a minor role (16%). While a median of 80 working hours was spent on method development, a large portion of participants stated that they did not have enough time for method development (32%). 25% perceived the infrastructure to be a bottleneck. Overall, 94% of all solutions were deep learning-based. Of these, 84% were based on standard architectures. 43% of the respondents reported that the data samples (e.g., images) were too large to be processed at once. This was most commonly addressed by patch-based training (69%), downsampling (37%), and solving 3D analysis tasks as a series of 2D tasks. K-fold cross-validation on the training set was performed by only 37% of the participants, and only 50% of the participants performed ensembling, based on multiple identical models (61%) or heterogeneous models (39%). 48% of the respondents applied postprocessing steps.
Chain of thought prompting successfully improves the reasoning capabilities of large language models, achieving state of the art results on a range of datasets. However, these reasoning capabilities only appear to emerge in models with a size of over 100 billion parameters. In this paper, we explore the transfer of such reasoning capabilities to models with less than 100 billion parameters via knowledge distillation. Specifically, we finetune a student model on the chain of thought outputs generated by a larger teacher model. Our experiments show that the proposed method improves task performance across arithmetic, commonsense and symbolic reasoning datasets. For example, the accuracy of T5 XXL on GSM8K improves from 8.11% to 21.99% when finetuned on PaLM-540B generated chains of thought.
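As a rough illustration of the distillation setup described above (the data format, field names, and the toy example are assumptions, not the paper's actual pipeline), the student's fine-tuning data can be built by pairing each question with the teacher's generated chain of thought and final answer:

# Illustrative sketch: turn teacher-generated chains of thought into
# (input, target) pairs for supervised fine-tuning of a smaller student model.
# The example record and formatting are assumptions, not the paper's pipeline.

def make_distillation_pair(question, teacher_rationale, teacher_answer):
    """Return one (input_text, target_text) fine-tuning example."""
    input_text = f"Q: {question}\nA:"
    # The student is trained to reproduce the reasoning steps *and* the answer.
    target_text = f"{teacher_rationale}\nThe answer is {teacher_answer}."
    return input_text, target_text

# Toy usage with a made-up arithmetic problem.
record = {
    "question": "Tom has 3 boxes with 4 apples each. How many apples does he have?",
    "teacher_rationale": "Each box has 4 apples and there are 3 boxes, so 3 * 4 = 12.",
    "teacher_answer": "12",
}
pair = make_distillation_pair(**record)
print(pair[0])
print(pair[1])
# Such pairs would then feed a standard seq2seq fine-tuning loop
# (e.g. a T5-style student trained with teacher forcing on target_text).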
With the rise of AI in recent years and the increase in complexity of the models, the growing demand in computational resources is starting to pose a significant challenge. The need for higher compute power is being met with increasingly more potent accelerators and the use of large compute clusters. However, the gain in prediction accuracy from large models trained on distributed and accelerated systems comes at the price of a substantial increase in energy demand, and researchers have started questioning the environmental friendliness of such AI methods at scale. Consequently, energy efficiency plays an important role for AI model developers and infrastructure operators alike. The energy consumption of AI workloads depends on the model implementation and the utilized hardware. Therefore, accurate measurements of the power draw of AI workflows on different types of compute nodes are key to algorithmic improvements and the design of future compute clusters and hardware. To this end, we present measurements of the energy consumption of two typical applications of deep learning models on different types of compute nodes. Our results indicate that 1. deriving energy consumption directly from runtime is not accurate, but the consumption of the compute node needs to be considered regarding its composition; 2. neglecting accelerator hardware on mixed nodes results in overproportional inefficiency regarding energy consumption; 3. energy consumption of model training and inference should be considered separately - while training on GPUs outperforms all other node types regarding both runtime and energy consumption, inference on CPU nodes can be comparably efficient. One advantage of our approach is that the information on energy consumption is available to all users of the supercomputer, enabling an easy transfer to other workloads alongside a rise in user awareness of energy consumption.
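To illustrate the first point (this is a generic sketch with made-up inputs, not the measurement infrastructure used in the study): integrating the actually measured node power over time gives a different, and generally more faithful, number than scaling runtime by a nominal per-node power.

import numpy as np

# Illustrative only: compare a runtime-based energy estimate with one obtained
# by integrating measured node power samples. All values below are made up.

timestamps_s = np.arange(0.0, 600.0, 1.0)                  # 10-minute job, 1 Hz sampling
measured_power_w = 180.0 + 120.0 * (timestamps_s > 60.0)   # idle phase, then load

# Naive estimate: runtime multiplied by the node's nominal (e.g. TDP-based) power.
nominal_power_w = 350.0
runtime_s = timestamps_s[-1] - timestamps_s[0]
energy_naive_j = nominal_power_w * runtime_s

# Measurement-based estimate: trapezoidal integration of the sampled power draw.
dt = np.diff(timestamps_s)
energy_measured_j = np.sum(0.5 * (measured_power_w[1:] + measured_power_w[:-1]) * dt)

print(f"runtime-based estimate : {energy_naive_j / 3.6e6:.3f} kWh")
print(f"integrated measurement : {energy_measured_j / 3.6e6:.3f} kWh")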
Neuromorphic computing using biologically inspired Spiking Neural Networks (SNNs) is a promising solution to meet Energy-Throughput (ET) efficiency needed for edge computing devices. Neuromorphic hardware architectures that emulate SNNs in analog/mixed-signal domains have been proposed to achieve order-of-magnitude higher energy efficiency than all-digital architectures, however at the expense of limited scalability, susceptibility to noise, complex verification, and poor flexibility. On the other hand, state-of-the-art digital neuromorphic architectures focus either on achieving high energy efficiency (Joules/synaptic operation (SOP)) or throughput efficiency (SOPs/second/area), resulting in poor ET efficiency. In this work, we present THOR, an all-digital neuromorphic processor with a novel memory hierarchy and neuron update architecture that addresses both energy consumption and throughput bottlenecks. We implemented THOR in 28nm FDSOI CMOS technology and our post-layout results demonstrate an ET efficiency of 7.29G $\text{TSOP}^2/\text{mm}^2\text{Js}$ at 0.9V, 400 MHz, which represents a 3X improvement over state-of-the-art digital neuromorphic processors.
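For context (this decomposition is an assumption based on the units reported above, not a definition quoted from the work), a composite energy-throughput figure of merit with units of $\text{SOP}^2/(\text{mm}^2\,\text{J}\,\text{s})$ arises naturally as the product of the two efficiencies mentioned:
$$\text{ET efficiency} \;=\; \underbrace{\frac{\text{SOP}}{\text{J}}}_{\text{energy efficiency}} \times \underbrace{\frac{\text{SOP}}{\text{s}\cdot\text{mm}^2}}_{\text{throughput efficiency}} \;=\; \frac{\text{SOP}^2}{\text{mm}^2\,\text{J}\,\text{s}},$$
which is why improving either term alone is not enough to raise the combined metric.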